RAW SEQUENCE LISTING DATE: 06/30/97 

PATENT APPLICATION US/08/848,439 TIME: 14:29:20 

INPUT SET: SWM.raw 



This Raw Listing contains the General 
Information Section and up to the first 5 pages. 



SEQUENCE LISTING 



(1) General Information: 




(i) APPLICANT: LaVALLIE, EDWARD 

RACIE, LISA 

(ii) TITLE OF INVENTION: HUMAN SDF-5 PROTEIN AND COMPOSITIONS 
(iii) NUMBER OF SEQUENCES: 3 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: GENETICS INSTITUTE, INC. 

(B) STREET: 87 CAMBRIDGEPARK DRIVE 

(C) CITY: CAMBRIDGE 

(D) STATE: MA 

(E) COUNTRY: USA 

(F) ZIP: 02140 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: LAZAR, STEVEN R. 

(B) REGISTRATION NUMBER: 32,618 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 498-8260 

(B) TELEFAX: (617) 876-5851 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2027 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCGGCC TTCATGGCCT AGCTCATTCT GCTCCCCCGG GTCGGAGCCC CCCGGAGCTG 60

CGCGCGGGCT TGCAGCGCCT CGCCCGCGCT CCTCCCGGTG TCCCGCTTCT CCGCGCCCCA 120

GCCGCCGGCT GCCAGCTTTT CGGGGCCCCG AGTCGCACCC AGCGAAGAGA GCGGGCCCGG 180

GACAAGCTCG AACTCCGGCC GCCTCGCCCT TCCCCGGCTC CGCTCCCTCT GCCCCCTCGG 240

GGTCGCGCGC CCACGATGCT GCAGGGCCCT GGCTCGCTGC TGCTGCTCTT CCTCGCCTCG 300

CACTGCTGCC TGGGCTCGGC GCGCGGGCTC TTCCTCTTTG GCCAGCCCGA CTTCTCCTAC 360

AAGCGCAGCA ATTGCAAGCC CATCCCGGCC AACCTGCAGC TGTGCCACGG CATCGAATAC 420

CAGAACATGC GGCTGCCCAA CCTGCTGGGC CACGAGACCA TGAAGGAGGT GCTGGAGCAG 480

GCCGGCGCTT GGATCCCGCT GGTCATGAAG CAGTGCCACC CGGACACCAA GAAGTTCCTG 540

TGCTCGCTCT TCGCCCCCGT CTGCCTCGAT GACCTAGACG AGACCATCCA GCCATGCCAC 600

TCGCTCTGCG TGCAGGTGAA GGACCGCTGC GCCCCGGTCA TGTCCGCCTT CGGCTTCCCC 660

TGGCCCGACA TGCTTGAGTG CGACCGTTTC CCCCAGGACA ACGACCTTTG CATCCCCCTC 720

GCTAGCAGCG ACCACCTCCT GCCAGCCACC GAGGAAGCTC CAAAGGTATG TGAAGCCTGC 780

AAAAATAAAA ATGATGATGA CAACGACATA ATGGAAACGC TTTGTAAAAA TGATTTTGCA 840

CTGAAAATAA AAGTGAAGGA GATAACCTAC ATCAACCGAG ATACCAAAAT CATCCTGGAG 900

ACCAAGAGCA AGACCATTTA CAAGCTGAAC GGTGTGTCCG AAAGGGACCT GAAGAAATCG 960

GTGCTGTGGC TCAAAGACAG CTTGCAGTGC ACCTGTGAGG AGATGAACGA CATCAACGCG 1020

CCCTATCTGG TCATGGGACA GAAACAGGGT GGGGAGCTGG TGATCACCTC GGTGAAGCGG 1080

TGGCAGAAGG GGCAGAGAGA GTTCAAGCGC ATCTCCCGCA GCATCCGCAA GCTGCAGTGC 1140

TAGTCCCGGC ATCCTGATGG CTCCGACAGG CCTGCTCCAG AGCACGGCTG ACCATTTCTG 1200

CTCCGGGATC TCAGCTCCCG TTCCCCAAGC ACACTCCTAG CTGCTCCAGT CTCAGCCTGG 1260

GCAGCTTCCC CCTGCCTTTT GCACGTTTGC ATCCCCAGCA TTTCCTGAGT TATAAGGCCA 1320

CAGGAGTGGA TAGCTGTTTT CACCTAAAGG AAAAGCCCAC CCGAATCTTG TAGAAATATT 1380

CAAACTAATA AAATCATGAA TATTTTTATG AAGTTTAAAA ATAGCTCACT TTAAAGCTAG 1440

TTTTGAATAG GTGCAACTGT GACTTGGGTC TGGTTGGTTG TTGTTTGTTG TTTTGAGTCA 1500

GCTGATTTTC ACTTCCCACT GAGGTTGTCA TAACATGCAA ATTGCTTCAA TTTTCTCTGT 1560

GGCCCAAACT TGTGGGTCAC AAACCCTGTT GAGATAAAGC TGGCTGTTAT CTCAACATCT 1620

TCATCAGCTC CAGACTGAGA CTCAGTGTCT AAGTCTTACA ACAATTCATC ATTTTATACC 1680

TTCAATGGGA ACTTAAACTG TTACATGTAT CACATTCCAG CTACAATACT TCCATTTATT 1740

AGAAGCACAT TAACCATTTC TATAGCATGA TTTCTTCAAG TAAAAGGCAA AAGATATAAA 1800

TTTTATAATT GACTTGAGTA CTTTAAGCCT TGTTTAAAAC ATTTCTTACT TAACTTTTGC 1860

AAATTAAACC CATTGTAGCT TACCTGTAAT ATACATAGTA GTTTACCTTT AAAAGTTGTA 1920

AAAATATTGC TTTAACCAAC ACTGTAAATA TTTCAGATAA ACATTATATT CTTGTATATA 1980

AACTTTACAT CCTGTTTTAC CTAAAAAAAA AAAAAAAAAG CGGCCGC 2027

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 295 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Leu Gln Gly Pro Gly Ser Leu Leu Leu Leu Phe Leu Ala Ser His
1               5                   10                  15

Cys Cys Leu Gly Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp
            20                  25                  30

Phe Ser Tyr Lys Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln
        35                  40                  45

Leu Cys His Gly Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu
    50                  55                  60

Gly His Glu Thr Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile
65                  70                  75                  80

Pro Leu Val Met Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys
                85                  90                  95

Ser Leu Phe Ala Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln
            100                 105                 110

Pro Cys His Ser Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val
        115                 120                 125

Met Ser Ala Phe Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg
    130                 135                 140

Phe Pro Gln Asp Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His
145                 150                 155                 160

Leu Leu Pro Ala Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys
                165                 170                 175

Asn Lys Asn Asp Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn
            180                 185                 190

Asp Phe Ala Leu Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg
        195                 200                 205

Asp Thr Lys Ile Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu
    210                 215                 220

Asn Gly Val Ser Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys
225                 230                 235                 240

Asp Ser Leu Gln Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro
                245                 250                 255

Tyr Leu Val Met Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser
            260                 265                 270

Val Lys Arg Trp Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg
        275                 280                 285

Ser Ile Arg Lys Leu Gln Cys
    290                 295

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 275 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp Phe Ser Tyr Lys
1               5                   10                  15

Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln Leu Cys His Gly
            20                  25                  30

Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu Gly His Glu Thr
        35                  40                  45

Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile Pro Leu Val Met
    50                  55                  60

Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys Ser Leu Phe Ala
65                  70                  75                  80

Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln Pro Cys His Ser
                85                  90                  95

Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val Met Ser Ala Phe
            100                 105                 110

Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg Phe Pro Gln Asp
        115                 120                 125

Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His Leu Leu Pro Ala
    130                 135                 140

Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys Asn Lys Asn Asp
145                 150                 155                 160

Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn Asp Phe Ala Leu
                165                 170                 175

Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg Asp Thr Lys Ile
            180                 185                 190

Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu Asn Gly Val Ser
        195                 200                 205

Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys Asp Ser Leu Gln
    210                 215                 220

Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro Tyr Leu Val Met
225                 230                 235                 240

Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser Val Lys Arg Trp
                245                 250                 255

Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg Ser Ile Arg Lys
            260                 265                 270

Leu Gln Cys



SEQUENCE VERIFICATION REPORT 

PATENT APPLICATION US/08/848,439 



DATE: 06/30/97 
TIME: 14:29:32 



INPUT SET: S18704.raw 



Original Text 



